Hybrid 3-connector fuzz test + VM_HOST env var #19

Merged
rophy merged 3 commits into master from fuzz-hybrid-3connector on Apr 11, 2026

Conversation

Owner

@rophy rophy commented Apr 11, 2026

Summary

  • Replace all hardcoded RAC VM IPs with ${VM_HOST} env var (auto-detected by vm-env.sh via virsh)
  • Implement hybrid 3-connector fuzz architecture: OLR (non-LOB) + LogMiner (LOB) on the "actual" side
  • Switch to rophy/debezium-server:3.5.0-2a7978c0af, which includes the mergeLogsByPrecedence RAC fix and the debezium-config JAR fix

Test plan

  • fuzz-test.sh up — all 3 connectors start with ${VM_HOST} env var substitution
  • fuzz-test.sh run 10 — 10-minute workload, 0 errors, all connectors survive log switches
  • fuzz-test.sh validate — 55,228 events, 0 mismatches (PASS)
  • fuzz-test.sh db-check — 22 LOB extras + 4 LOB diffs (known DBZ-9615, identical on both sides)

Summary by CodeRabbit

  • New Features

    • Added dedicated LOB capture pipeline and Kafka topic for LOB events.
  • Chores

    • Parameterized database/host values to use environment variables instead of hardcoded IPs across test configs.
    • Updated Debezium server images and Docker Compose services to support the new connector.
    • Enhanced test scripts and consumers to wait for and subscribe to the new LOB connector/topic.
  • Tests

    • Improved validation, readiness checks, and environment handling for RAC test runs.

rophy added 2 commits April 11, 2026 01:38
Replace all hardcoded 192.168.122.x IPs with ${VM_HOST} env var.
VM IP is auto-detected by vm-env.sh via virsh — no manual update
needed when VM IP changes (e.g. new VM deployment).

- .properties files: use Quarkus ${VM_HOST} substitution
- docker-compose: use ${VM_HOST} in commands
- rac.sh: fail-fast with ${VM_HOST:?} instead of silent default
- db-check.py: require ORACLE_HOST, no fallback IP
- .env: remove DB_CONN (set by rac driver)
- vm-env.sh: remove stale IP mismatch validation

Switch to rophy/debezium-server:3.5.0-2a7978c0af which includes:
- debezium-config JAR fix (missing in stock 3.5.0.Final)
- mergeLogsByPrecedence RAC fix (thread-aware dedup)

Implement hybrid architecture: OLR for non-LOB tables +
LogMiner for LOB tables on the "actual" side.

- Add dbz-lob-logminer service (LOB-only LogMiner connector)
- OLR config: skip-lob-tables=1
- Consumer: subscribe to 3 topics, route olr-lob-events to OLR side
- Validator: remove KNOWN_LOB_TABLES exemption, LOB mismatches
  are real failures with hybrid setup
- fuzz-test.sh: 3-connector offset seeding, wait, logs
- Remove restart: unless-stopped, remove Beta1 JAR mounts
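
For readers following the consumer change, here is a minimal sketch of how three topics could be routed onto the two comparison sides. Only the `olr-lob-events` topic name and the 'logminer'/'olr' side labels come from this PR; the other two topic names and the `classify_topic` helper are hypothetical and are not taken from kafka-consumer.py.

```python
# Topic names. Only "olr-lob-events" appears in this PR; the other two
# values are placeholders chosen for illustration.
LM_TOPIC = "lm-events"            # assumed name
OLR_TOPIC = "olr-events"          # assumed name
OLR_LOB_TOPIC = "olr-lob-events"  # from the PR

ALL_TOPICS = [LM_TOPIC, OLR_TOPIC, OLR_LOB_TOPIC]


def classify_topic(topic: str) -> str:
    """Map a Kafka topic to the side of the comparison it belongs to.

    The LOB-only LogMiner connector feeds the "actual" (OLR) side, so its
    topic is classified as 'olr' even though LogMiner produced the events.
    """
    if topic == LM_TOPIC:
        return "logminer"
    if topic in (OLR_TOPIC, OLR_LOB_TOPIC):
        return "olr"
    raise ValueError(f"unexpected topic: {topic}")
```

Raising on an unknown topic, rather than defaulting to one side, keeps a misconfigured topic name from silently skewing the comparison.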

coderabbitai Bot commented Apr 11, 2026

📝 Walkthrough

Walkthrough

Added a dedicated Debezium LogMiner connector for LOB tables and integrated it into the RAC fuzz-test stack; replaced hardcoded IPs with ${VM_HOST} across configs and scripts; tightened VM_HOST requirement; shifted validator behavior to treat LOB mismatches as real failures.
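
As an illustration of what "treat LOB mismatches as real failures" could look like, the sketch below compares one LOB row from each side and returns tagged mismatch lines. The `[LOB_EXISTENCE]` and `[LOB_VALUE_DIFF]` tags are the ones named in this review; the function name, the event shape (a dict of column to value) and the message wording are assumptions, not the actual validator.py code.

```python
def compare_lob_event(key, lm_event, olr_event):
    """Return mismatch lines for one LOB row key; an empty list means a match.

    key       -- hypothetical row identifier, e.g. "TABLE:pk"
    lm_event  -- dict of column -> value from the LogMiner side, or None
    olr_event -- dict of column -> value from the OLR side, or None
    """
    # Row present on exactly one side: existence mismatch.
    if (lm_event is None) != (olr_event is None):
        missing = "olr" if olr_event is None else "logminer"
        return [f"[LOB_EXISTENCE] {key}: missing on {missing} side"]
    # Row absent on both sides: nothing to compare.
    if lm_event is None:
        return []
    # Row present on both sides: compare column by column.
    diffs = []
    for col in sorted(set(lm_event) | set(olr_event)):
        if lm_event.get(col) != olr_event.get(col):
            diffs.append(
                f"[LOB_VALUE_DIFF] {key}.{col}: "
                f"lm={lm_event.get(col)!r} olr={olr_event.get(col)!r}"
            )
    return diffs
```

With no exemption list, any non-empty result counts toward the mismatch total in the same way as a non-LOB difference.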

Changes

| Cohort / File(s) | Summary |
|---|---|
| **New LOB Debezium config**<br>tests/dbz-twin/rac/config/application-lob-logminer-kafka.properties | New Debezium LogMiner → Kafka config scoped to LOB table(s), routes to olr-lob-events, uses Kafka offset storage and JSON format without schemas. |
| **Debezium / OLR configs**<br>tests/dbz-twin/rac/config/application-logminer-kafka.properties, tests/dbz-twin/rac/config/application-logminer.properties, tests/dbz-twin/rac/config/application-olr-kafka.properties, tests/dbz-twin/rac/config/application-olr.properties | Replaced hardcoded Oracle host IP with ${VM_HOST} placeholder; one file added new LogMiner LOB config. |
| **OLR JSON config**<br>tests/dbz-twin/rac/config/olr-config.json | Added skip-lob-tables: 1 in Debezium/format settings. |
| **Docker / compose / runtime**<br>tests/dbz-twin/rac/docker-compose-fuzz.yaml, tests/dbz-twin/rac/fuzz-test.sh, tests/environments/rac/.env, tests/environments/rac/vm-env.sh, tests/sql/scripts/drivers/rac.sh | Added dbz-lob-logminer service and named volume; updated Debezium image references; require VM_HOST via env; changed fuzz orchestration and env handling; removed VM_HOST consistency checks. |
| **Consumer / topics**<br>tests/dbz-twin/rac/kafka-consumer.py, tests/dbz-twin/rac/fuzz-test.sh | Added OLR_LOB_TOPIC env/default olr-lob-events; consumer now waits for three topics and classifies OLR_LOB_TOPIC as 'olr'; seeding and readiness logic extended for new connector. |
| **Validator logic**<br>tests/dbz-twin/rac/validator.py | Renamed/changed LOB allowlist handling to suffix-based LOB_TABLES; removed known-LOB counter; treat LOB existence/value diffs as mismatches with new [LOB_EXISTENCE] and [LOB_VALUE_DIFF] logs. |
| **Misc perf / prometheus**<br>tests/dbz-twin/rac/perf/config/application-logminer.properties, tests/dbz-twin/rac/perf/config/application-olr.properties, tests/dbz-twin/rac/perf/config/prometheus.yml, tests/dbz-twin/rac/perf/docker-compose.yaml | Replaced fixed IPs with ${VM_HOST} placeholders and updated generated Prometheus header; adjusted Docker Compose Swingbench host param. |
| **DB check script**<br>tests/dbz-twin/rac/db-check.py | Removed default host fallback; require ORACLE_HOST/ORACLE_DSN or exit with error. |
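
The "no fallback IP" behavior described for db-check.py can be sketched as below. This is a minimal illustration, not the script's actual code: `resolve_dsn`, `ORACLE_PORT` and `ORACLE_SERVICE` are hypothetical names, and the 1521/ORCLPDB defaults are borrowed from the JDBC URL quoted elsewhere in this review.

```python
import os


def resolve_dsn(env=None):
    """Build an Oracle DSN from the environment, refusing to guess a host.

    ORACLE_DSN wins if set; otherwise ORACLE_HOST is required. There is
    deliberately no default host, so a stale IP can never be used silently.
    """
    env = os.environ if env is None else env
    dsn = env.get("ORACLE_DSN")
    if dsn:
        return dsn
    host = env.get("ORACLE_HOST")
    if not host:
        # SystemExit with a string prints the message to stderr and exits 1.
        raise SystemExit("ORACLE_HOST or ORACLE_DSN must be set (source vm-env.sh first)")
    port = env.get("ORACLE_PORT", "1521")            # hypothetical variable
    service = env.get("ORACLE_SERVICE", "ORCLPDB")   # hypothetical variable
    return f"{host}:{port}/{service}"
```

Passing the environment in as a parameter is only there to make the function easy to exercise without touching `os.environ`.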

Sequence Diagram(s)

sequenceDiagram
    participant Oracle as Oracle DB (LogMiner)
    participant DebeziumLOB as Debezium LogMiner (lob-logminer)
    participant Kafka as Kafka
    participant Consumer as kafka-consumer.py
    participant Validator as validator.py

    Oracle->>DebeziumLOB: Stream LOB redo records
    DebeziumLOB->>Kafka: Publish to `olr-lob-events` topic
    Kafka->>Consumer: Deliver messages from topics (lm, olr, olr-lob)
    Consumer->>Validator: Forward events grouped by adapter ('logminer'/'olr')
    Validator->>Validator: Compare LM vs OLR events, treat LOB diffs as mismatches
    Validator-->>Consumer: Emit validation results / metrics

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through logs where LOBs once hid,
VM_HOST now guides where IPs were slid.
Kafka sings topics, three streams in a row,
Validators watch closely — no more soft “we know.”
A carrot, a nod — the test garden grows. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title concisely and accurately summarizes the two main changes: implementing a hybrid 3-connector fuzz test architecture and replacing hardcoded IPs with ${VM_HOST} environment variable. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/dbz-twin/rac/fuzz-test.sh`:
- Around line 431-445: Update the script header printed by action_help() to
include the new lob-logminer component and to reflect the correct workflow
ordering and RAC cleanup instruction: locate the action_help() function (or the
top-of-file help comment block it prints) and add "lob-logminer" to the
Components list, reorder the example workflow to show "down" before "up", and
add the note "Always run ./fuzz-test.sh down before ./fuzz-test.sh up in RAC
fuzz tests to ensure a clean environment" so the output of ./fuzz-test.sh help
matches the usage text updated in the logs case.

In `@tests/dbz-twin/rac/kafka-consumer.py`:
- Around line 147-159: The loop that checks Kafka topics should wait for all
entries in all_topics (LM_TOPIC, OLR_TOPIC, OLR_LOB_TOPIC) before proceeding;
currently it breaks as soon as LM_TOPIC or OLR_TOPIC exists which can mask an
implicitly created OLR_LOB_TOPIC on subscribe. Change the condition in the
polling loop that examines consumer.topics() (the block using topics =
consumer.topics()) to require every topic in all_topics to be present (e.g.,
check that set(all_topics).issubset(topics) or iterate and confirm each), only
then break and call consumer.subscribe(all_topics) and consumer.poll().

In `@tests/dbz-twin/rac/perf/docker-compose.yaml`:
- Line 47: The JDBC URL string "//${VM_HOST}:1521/ORCLPDB" silently becomes
malformed when VM_HOST is unset; update the interpolation to the required
expansion form so Compose fails fast—replace "${VM_HOST}" with a required
expansion like "${VM_HOST:?VM_HOST is not set}" in that JDBC URL entry to force
an error if VM_HOST is missing.

In `@tests/dbz-twin/rac/validator.py`:
- Line 248: The current LOB detection uses direct membership check ("is_lob =
event_table in LOB_TABLES") which fails for schema-qualified or case-variant
names; update the check to normalize event_table by stripping any schema
qualifier and surrounding quotes/backticks and then compare in a normalized case
(e.g., upper() or lower()) against a similarly normalized LOB_TABLES set. Modify
the logic around the is_lob assignment in validator.py so it derives a plain
table name (split on '.' and remove quotes) and performs case-normalized
membership against LOB_TABLES to correctly detect LOB tables regardless of
schema qualification or case differences.

In `@tests/environments/rac/.env`:
- Around line 1-2: Update the comment for VM_HOST in the rac environment file to
explicitly state that VM_HOST must be initialized by sourcing the vm-env.sh
script (i.e., callers must run or source vm-env.sh before invoking RAC flows),
and mention that rac.sh expects VM_HOST to be set or it will exit (see rac.sh
behavior around its VM_HOST check). Keep the note next to the existing DB_CONN
comment and use the exact symbol VM_HOST and filenames vm-env.sh and rac.sh so
callers know which script to run and which script relies on the variable.

In `@tests/sql/scripts/drivers/rac.sh`:
- Line 39: The strict fail-fast assignment VM_HOST="${VM_HOST:?VM_HOST is
required — source tests/environments/rac/vm-env.sh}" causes the RAC driver to
hard-fail because generate.sh doesn't source tests/environments/rac/vm-env.sh
first; change the driver to not use the "?" expansion (e.g.,
VM_HOST="${VM_HOST:-}") and instead add a guarded attempt to source vm-env.sh
(if present) or bail gracefully only when RAC execution truly requires VM_HOST
later; update the VM_HOST handling in tests/sql/scripts/drivers/rac.sh so it
either sources vm-env.sh when VM_HOST is empty or defers the fatal error to a
later, explicit check that runs inside the RAC execution path.
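
The kafka-consumer.py finding above (wait for every topic, not just the first one that appears) can be sketched as a small polling helper. This is an illustration under assumptions: `wait_for_topics` is a hypothetical name, and the caller is expected to pass a function that returns the set of currently known topics, such as `consumer.topics` on a kafka-python `KafkaConsumer`.

```python
import time


def wait_for_topics(list_topics, required, timeout_s=60.0, interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Block until every topic in `required` is reported by `list_topics()`.

    Raises TimeoutError naming the topics that never appeared, instead of
    subscribing early and letting the broker auto-create the missing ones.
    """
    required = set(required)
    deadline = clock() + timeout_s
    while True:
        missing = required - set(list_topics())
        if not missing:
            return
        if clock() >= deadline:
            raise TimeoutError(f"topics still missing: {sorted(missing)}")
        sleep(interval_s)
```

A caller would run `wait_for_topics(consumer.topics, all_topics)` and only then call `consumer.subscribe(all_topics)`. The `sleep` and `clock` parameters exist only so the loop can be tested without real delays.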

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a4006b05-3c83-423c-8e5f-cf9f55add7d2

📥 Commits

Reviewing files that changed from the base of the PR and between 7087cc6 and 97e5c90.

📒 Files selected for processing (18)
  • tests/dbz-twin/rac/config/application-lob-logminer-kafka.properties
  • tests/dbz-twin/rac/config/application-logminer-kafka.properties
  • tests/dbz-twin/rac/config/application-logminer.properties
  • tests/dbz-twin/rac/config/application-olr-kafka.properties
  • tests/dbz-twin/rac/config/application-olr.properties
  • tests/dbz-twin/rac/config/olr-config.json
  • tests/dbz-twin/rac/db-check.py
  • tests/dbz-twin/rac/docker-compose-fuzz.yaml
  • tests/dbz-twin/rac/fuzz-test.sh
  • tests/dbz-twin/rac/kafka-consumer.py
  • tests/dbz-twin/rac/perf/config/application-logminer.properties
  • tests/dbz-twin/rac/perf/config/application-olr.properties
  • tests/dbz-twin/rac/perf/config/prometheus.yml
  • tests/dbz-twin/rac/perf/docker-compose.yaml
  • tests/dbz-twin/rac/validator.py
  • tests/environments/rac/.env
  • tests/environments/rac/vm-env.sh
  • tests/sql/scripts/drivers/rac.sh
💤 Files with no reviewable changes (1)
  • tests/environments/rac/vm-env.sh

Comment thread tests/dbz-twin/rac/fuzz-test.sh
Comment thread tests/dbz-twin/rac/kafka-consumer.py
Comment thread tests/dbz-twin/rac/perf/docker-compose.yaml Outdated
Comment thread tests/dbz-twin/rac/validator.py Outdated
Comment thread tests/environments/rac/.env Outdated
Comment thread tests/sql/scripts/drivers/rac.sh
- fuzz-test.sh: add lob-logminer to help text, show down-before-up workflow
- kafka-consumer.py: wait for all 3 topics before subscribing, fail on missing
- perf/docker-compose.yaml: use ${VM_HOST:?} for fail-fast on missing env var
- validator.py: normalize table name (strip schema, uppercase) for LOB detection
- .env: clarify VM_HOST must be sourced from vm-env.sh
- rac.sh: auto-source vm-env.sh when VM_HOST is not set
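
The validator.py item in this commit (strip schema, uppercase) might look roughly like the sketch below. The table name in `LOB_TABLES` and both helper names are hypothetical, and the sketch uses exact membership after normalization, which may differ from the suffix-based matching the walkthrough describes.

```python
# Hypothetical LOB table set; the real names live in validator.py.
LOB_TABLES = {"FUZZ_LOB"}


def normalize_table(name: str) -> str:
    """Reduce 'schema."table"' or '`schema`.`table`' to a bare uppercase name.

    Note: a quoted identifier that itself contains a dot is not handled.
    """
    bare = name.rsplit(".", 1)[-1]      # drop any schema qualifier
    return bare.strip('"`').upper()     # drop quotes/backticks, fold case


def is_lob_table(name: str, lob_tables=LOB_TABLES) -> bool:
    """True if the normalized table name is one of the known LOB tables."""
    return normalize_table(name) in {t.upper() for t in lob_tables}
```

For example, `is_lob_table('olr_test."fuzz_lob"')` and `is_lob_table("FUZZ_LOB")` both resolve to the same entry.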

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
tests/dbz-twin/rac/fuzz-test.sh (1)

216-225: Bound Debezium readiness log scans to avoid expensive full-log polling.

Lines 218-220 read entire container logs on every poll iteration. As logs grow, this gets slower and noisier. Use --tail (or --since) for bounded checks.

♻️ Suggested refinement
-        docker logs fuzz-dbz-logminer 2>&1 | grep -q "Starting streaming" && LM_OK=true
-        docker logs fuzz-dbz-olr 2>&1 | grep -q "streaming client started\|Starting streaming" && OLR_OK=true
-        docker logs fuzz-dbz-lob-logminer 2>&1 | grep -q "Starting streaming" && LOB_LM_OK=true
+        docker logs --tail 200 fuzz-dbz-logminer 2>&1 | grep -q "Starting streaming" && LM_OK=true
+        docker logs --tail 200 fuzz-dbz-olr 2>&1 | grep -Eq "streaming client started|Starting streaming" && OLR_OK=true
+        docker logs --tail 200 fuzz-dbz-lob-logminer 2>&1 | grep -q "Starting streaming" && LOB_LM_OK=true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/dbz-twin/rac/fuzz-test.sh` around lines 216 - 225, The readiness loop
repeatedly calls docker logs for each container which rereads the full logs on
every iteration; modify the three docker logs invocations that set LM_OK,
OLR_OK, and LOB_LM_OK to include a bounded tail (e.g., --tail N) or since (e.g.,
--since "10s") flag so each grep only scans recent output instead of the entire
log; update the commands that check "Starting streaming" / "streaming client
started" in the loop (the lines that set LM_OK, OLR_OK, LOB_LM_OK) to use --tail
or --since to limit work and improve performance.
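
For comparison, the same bounded readiness check can be sketched in Python. The container name and the two marker strings come from the diff above; the helper names and the injectable `run` parameter are assumptions made for illustration.

```python
import re
import subprocess

# Readiness markers quoted in the suggested diff.
READY = re.compile(r"streaming client started|Starting streaming")


def logs_cmd(container: str, tail: int = 200) -> list:
    """Build a `docker logs` command limited to the last `tail` lines."""
    return ["docker", "logs", "--tail", str(tail), container]


def is_streaming(log_text: str) -> bool:
    """True if the log text contains one of the readiness markers."""
    return READY.search(log_text) is not None


def container_ready(container: str, tail: int = 200, run=subprocess.run) -> bool:
    """Scan only the recent log output of `container` for a readiness marker."""
    proc = run(logs_cmd(container, tail), stdout=subprocess.PIPE,
               stderr=subprocess.STDOUT, text=True, check=False)
    return is_streaming(proc.stdout or "")
```

One trade-off worth keeping in mind: with a bounded tail, a marker can scroll out of the window on a chatty container, so a caller would want to latch each flag once it turns true rather than re-evaluating it on every poll.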
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1e972a6d-df7b-44f1-9413-68d2b3a0c9fa

📥 Commits

Reviewing files that changed from the base of the PR and between 97e5c90 and 591504c.

📒 Files selected for processing (6)
  • tests/dbz-twin/rac/fuzz-test.sh
  • tests/dbz-twin/rac/kafka-consumer.py
  • tests/dbz-twin/rac/perf/docker-compose.yaml
  • tests/dbz-twin/rac/validator.py
  • tests/environments/rac/.env
  • tests/sql/scripts/drivers/rac.sh
✅ Files skipped from review due to trivial changes (1)
  • tests/environments/rac/.env
🚧 Files skipped from review as they are similar to previous changes (4)
  • tests/sql/scripts/drivers/rac.sh
  • tests/dbz-twin/rac/perf/docker-compose.yaml
  • tests/dbz-twin/rac/kafka-consumer.py
  • tests/dbz-twin/rac/validator.py

@rophy rophy merged commit b9f7e48 into master Apr 11, 2026
2 checks passed
@rophy rophy deleted the fuzz-hybrid-3connector branch April 11, 2026 02:14